Improving speech recognition performance in noisy environments
Abstract
Speech recognition errors degrade the performance of multimodal systems, especially in noisy conditions. For that reason, the use of speech recognition confidence scores and time stamps is suggested. This paper examines the impact of a signal enhancement method on these speech recognition parameters in the presence of noise. When noisy signals are recognised, not only does the word error rate (WER) increase, but the word-based confidence scores also decline. As a result, the role of speech recognition in multimodal systems weakens. The experiments use input signals corrupted by coloured noise at varying signal-to-noise ratios (SNR). A non-linear spectral subtraction (NSS) method is used in conjunction with the continuous speech recognition system developed for the ERMIS project (IST-2000-29319) to quantify the impact of speech enhancement on the speech recognition output.
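The two quantities the abstract relies on, WER and SNR, can be illustrated with a minimal sketch. It is not the paper's evaluation code: the function names `word_error_rate` and `mix_at_snr` are hypothetical, WER is computed here as a plain word-level edit distance divided by the reference length, and the noise is simply whatever sample list the caller supplies (no coloured-noise generator and no NSS enhancement is implemented).

```python
import math


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance (substitutions, deletions, insertions)
    divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def mean_power(samples):
    """Average squared amplitude of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power matches `snr_db`,
    then add it to `speech` sample by sample."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_power = mean_power(noise)
    if noise_power == 0:
        raise ValueError("noise must have non-zero power")
    target_noise_power = mean_power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    print(mix_at_snr([1.0, -1.0, 1.0, -1.0], [0.5, 0.5, -0.5, -0.5], 10.0))
```

In an experiment of the kind described, one would sweep `snr_db` over several values, recognise each noisy signal with and without enhancement, and compare the resulting WER and confidence scores.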
Similar resources
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of emotional states in speech under noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Improving the performance of MFCC for Persian robust speech recognition
The Mel frequency cepstral coefficients (MFCC) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
The Efficient PMC for Robust Speech Recognition in Noisy Environments
Environment-adaptive methods play an important part in improving the robustness of automatic speech recognition. In this paper, PMC is reviewed and improved to achieve better performance. The experiments were carried out with Cambridge's HTK toolkit, implementing continuous Mandarin digit recognition in noisy environments.
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems for feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
An effective cluster-based model for robust speech detection and speech recognition in noisy environments.
This paper shows an accurate speech detection algorithm for improving the performance of speech recognition systems working in noisy environments. The proposed method is based on a hard decision clustering approach where a set of prototypes is used to characterize the noisy channel. Detecting the presence of speech is enabled by a decision rule formulated in terms of an averaged distance betwee...
Recognition of Persian accents from the speech signal using efficient feature extraction methods and classifier combination
Speech recognition has improved greatly in recent years. However, robustness is still one of the big problems: recognition performance fluctuates sharply depending on the speaker, especially when the speaker has a strong accent. Different accents dramatically decrease the accuracy of an ASR system. In this paper we apply three new methods of feature extraction including Spectral C...